Data have proliferated in seemingly every area of life. “Big 
data” and the algorithms that make sense of them have 
revolutionized fields like medicine and business1 and have 
led to the rise of data analytics and data science, which use 
visualization, algorithms, and other data analysis techniques 
to extract insights from data and drive decisionmaking.2 
Postsecondary education, too, is experiencing its own data 
deluge. Online applications now capture fine-grained teaching 
and learning data on a large scale,3 and large quantities of 
institutional data have been collected in response to increased 
external pressure for accountability in higher education.4 


Key Findings 

- Data analytics and data science can address challenges to 
  student success. 

- Postsecondary institutions have already creatively used 
  analytics to address the problem of college completion through 
  innovations such as academic early warning systems and 
  adaptive learning technologies. 

- A diverse array of data on postsecondary education exists both 
  within and outside of institutions and can be used in analytics 
  to provide a richer view of student success and improve equity. 

- Increased data collection and analysis open up the challenges 
  of data linkage across units and the risk of ethical and 
  privacy violations, which deserve more attention. 


Student success is a key area where these data can be put 
to use. Student success can be defined broadly to include 
students’ personal development and goals; however, efforts 
to foster student success have generally focused on students’ 
academic performance.5 College completion in particular has 
gained significant attention since it was made a focus of the 
Obama administration's higher education policy. However, 
although completion rates have improved, concerns about 
equity have not been adequately addressed.6 Other issues also 
affect students’ success. Costs of college and debt burdens are 
both rising,7 as is anxiety about the value of a postsecondary 
credential in the labor market.8 And even with the abundance 
of available data, providing relevant, timely information to 
the students and families who need it remains a challenge.9 
With their ability to deliver insights quickly from large troves 
of data, data analytics and data science have the potential 
to address some of the most pressing challenges to student 
success. 


In this brief, I discuss the potential of new data sources in 
postsecondary education for data analytics and data science 
focused on student success, with the aim of sharing this 
knowledge with stakeholders in postsecondary education and 
with data scientists. Postsecondary stakeholders could benefit 
from understanding the new possibilities that data science 
may provide for addressing student success issues, and data 
scientists could benefit from knowing about the opportunities 
and analytic questions in postsecondary education. I first 
describe how institutions have applied data analytics to student 
academic data to address problems related to completion. 

I then discuss the range of data that exist both within and 
outside institutions and provide a taxonomy of how these data 
may be used. I argue that incorporating a broader array of 
data into analytics could provide richer answers to questions 
about how to improve student success. Finally, I conclude with 
a discussion of the practical and ethical issues that accompany 
increased data collection and analysis. 


Academic Analytics for College Completion 


In recent years, more and more postsecondary institutions 
have begun using analytics to aid in institutional operations; 
this growing field is termed “academic analytics.”10 Although 
the goals of academic analytics have been diverse, serving 
postsecondary institutions’ business and accountability goals 
in admissions, fundraising, and student affairs,11 a major 
area of application has been in student success, specifically 
college completion. I discuss examples of these applications 
below. 


Georgia State University has been recognized as a leading 
institution on this front. Using predictive analytics, among 
other innovations, it raised its 6-year graduation rate from 
32 percent in 2003 to 55 percent in 2018 and reduced 
graduation rate gaps between disadvantaged and non-disadvantaged 
students.12 At the core of this strategy is a 
predictive analytics system tracking more than 800 risk factors 
of dropout.12,13 A team of advisers monitors this system to 
provide targeted help to students.12,13 Informed by insights 
from this predictive analytics system, the university has also 
clarified student academic pathways and made changes to the 
structure of course requirements. 
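
To make the idea of an early warning system concrete, the following is a minimal sketch of how binary risk flags might be combined into a score that advisers can sort on. It is not Georgia State's system: the four factor names, the weights, the intercept, and the 0.5 threshold are all hypothetical values chosen for illustration, and a real system would estimate weights from historical student records. 

```python
import math

# Hypothetical risk factors and weights, chosen only for illustration.
# A real system would estimate these from historical student records.
WEIGHTS = {
    "failed_gateway_course": 1.2,
    "credits_below_full_time": 0.8,
    "unpaid_balance": 0.6,
    "gpa_drop": 0.9,
}
INTERCEPT = -2.0  # baseline log-odds when no risk factor is present


def dropout_risk(flags):
    """Return a score in (0, 1) from a dict of binary risk flags."""
    z = INTERCEPT + sum(w for name, w in WEIGHTS.items() if flags.get(name))
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def students_to_contact(records, threshold=0.5):
    """Return IDs of students at or above the threshold, highest score first."""
    scored = [(dropout_risk(flags), sid) for sid, flags in records.items()]
    return [sid for score, sid in sorted(scored, reverse=True) if score >= threshold]
```

An adviser-facing tool could call `students_to_contact` on each term's records and route the resulting list to the advising team. The threshold controls the trade-off between missing at-risk students and overloading advisers. 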


Other institutions also have developed tools to facilitate 
students’ academic progress. At Austin Peay State University 
in Tennessee, a tool called Degree Compass provides “Netflix-style” 
course recommendations for students that account for 
their interests, program requirements, and past performance to 
help them tackle the challenge of choosing courses from a wide 
array of options.14 Tools like Course Signals at Purdue and 
ECoach at the University of Michigan help students manage 
their performance within courses by providing personalized 
advice and early warnings of poor performance.15,16 
Universities have often partnered with external vendors 
to build these tools. The “Check My Activity” tool at the 
University of Maryland, Baltimore County, was developed 
in collaboration with Blackboard, a course management 
system, and allows students to compare their level of activity 
in Blackboard against an anonymous summary of their course 
peers’ performance.17 
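
The comparison of one student's activity against an anonymous peer summary can be sketched as follows. This is an illustrative example only, not the actual “Check My Activity” implementation: the function name, the input format (a mapping from student ID to an activity count), and the choice of summary statistics are assumptions. 

```python
from statistics import median


def peer_summary(activity_by_student, student_id):
    """Compare one student's activity count with an aggregate of course peers.

    Only aggregate values are returned, so no peer is shown individually.
    """
    own = activity_by_student[student_id]
    peers = [n for sid, n in activity_by_student.items() if sid != student_id]
    if not peers:
        raise ValueError("no peers to compare against")
    share_below = sum(1 for n in peers if n < own) / len(peers)
    return {
        "own_activity": own,
        "peer_median": median(peers),
        "share_of_peers_below": share_below,
    }
```

Returning only the median and a share keeps individual peers out of view, although in very small courses even aggregates can reveal information about specific classmates. 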


Alongside work in academic analytics, work in the field of 
learning analytics has focused on using data-driven tools 
and models to analyze online learning behavior and provide 
personalized solutions to students’ learning challenges.18 
This work has led to the development of adaptive learning 
technologies, which use machine learning to adapt online 
learning experiences in real time to each student’s individual 
skills and understanding of the material.19 Institutions 
have begun to implement these technologies: for example, 
the University of Central Florida and Colorado Technical 
University partnered with Realizeit, an online learning vendor, 
to pilot an adaptive learning platform in select courses. Early 
evidence shows improvement in student outcomes, such as 
greater engagement with course concepts.20 
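
One way to picture how a platform might adapt to a student's understanding in real time is Bayesian knowledge tracing, a well-known student-modeling technique. The sketch below is offered only as an illustration of the general idea; the brief does not say which models Realizeit or any other vendor uses, and the slip, guess, and learning probabilities and the 0.95 mastery threshold are assumed values. 

```python
def update_mastery(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One Bayesian knowledge tracing step for a single skill.

    p_known is the prior probability that the student has mastered the skill.
    The observed answer updates that belief, and p_learn then accounts for
    learning that may occur during the practice opportunity itself.
    """
    if correct:
        evidence = p_known * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_known) * p_guess)
    else:
        evidence = p_known * p_slip
        posterior = evidence / (evidence + (1 - p_known) * (1 - p_guess))
    return posterior + (1 - posterior) * p_learn


def next_activity(p_known, mastery_threshold=0.95):
    """Choose more practice or new material based on estimated mastery."""
    return "advance" if p_known >= mastery_threshold else "practice"
```

Calling `update_mastery` after each answer and then `next_activity` on the result gives a simple loop in which a run of correct answers moves the student forward, while errors keep the student on the current skill. 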


Alternative Data Sources and Their Applications 
for Student Success 


The above examples demonstrate that higher education 
institutions have made strides in harnessing student academic 
data and classroom and learning data to promote student 
success and completion. However, barriers to completion are 
often nonacademic: many students face challenges such as food 
and housing insecurity,21 as well as employment and family 
demands that compete for their time and hinder their ability to 
obtain a credential.22 Linking students’ nonacademic records 
to their academic records could provide insight into the ways 
these challenges systematically affect academic progress, and 
the linked nonacademic records could also serve as early 
warning indicators. Institutions 
could also evaluate the effectiveness of the student services 
they offer using linked student services and academic data. 
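
Linking student services data to academic data can be sketched as a join on a shared student identifier followed by a simple comparison of outcomes. The field names (`retained`, `used_service`) and the input shapes below are hypothetical, and the resulting comparison is descriptive only: students who use a service may differ systematically from those who do not, so a gap in rates is not by itself evidence that the service works. 

```python
def link_records(academic, service_users):
    """Left-join academic records to service-use records on student ID.

    academic: dict mapping student ID -> dict with a boolean "retained" field.
    service_users: set of student IDs that used a given service.
    """
    linked = []
    for sid, record in academic.items():
        row = {"student_id": sid, **record}
        row["used_service"] = sid in service_users
        linked.append(row)
    return linked


def retention_by_service_use(linked):
    """Return the retention rate for service users (True) and non-users (False)."""
    groups = {True: [], False: []}
    for row in linked:
        groups[row["used_service"]].append(row["retained"])
    return {
        used: (sum(vals) / len(vals) if vals else None)
        for used, vals in groups.items()
    }
```

The join keeps every student in the academic file, so non-users remain in the comparison group instead of silently dropping out of the analysis. 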


Digital tools also could offer solutions directly to students; 
for example, such tools could connect students to available 
housing or sources of food, or help students navigate 
services the institution already provides. Many schools offer 
emergency aid, for instance, but students are often unaware 
of such programs.23 Researchers have begun to make creative 
use of new data sources within institutions to study aspects 
of students’ experiences outside of class that can affect 
completion. Studies of the digital traces of campus social life, 
such as key card swipes into campus buildings, have examined 
their potential to predict student engagement and integration, 
which are ultimately crucial to retention and completion.24 


To make sense of the growing diversity of data sources, 
Table 1 displays the types of data sources available within 
postsecondary institutions, as well as student success issues 
that can be addressed by each data type. Classroom and 
learning data pertain to higher education’s core function of 
educating students and measuring learning. Student academic 
records measure students’ progress through their academic 
programs. Student nonacademic records pertain to students 
themselves and to the nonacademic aspects of students’ lives. 
Institutional unit data relate to the operations of an institution's 
academic programs and business units. 


In addition, a vast amount of postsecondary education data 
are being collected outside of individual institutions, and these 
data, and the tools and analyses that make sense of them, can 
also be brought to bear on student success. Diverse sources 
of data can provide a broader, more holistic understanding 
of what student success entails and the factors that shape it: 
student success could encompass not only the completion of 
a degree or certificate program, but also skills or knowledge 
acquisition, finding a job, and socioeconomic mobility. 


Data linking academic experiences to workforce and economic 
data can be brought to bear on these issues. State governments 
have begun to build State Longitudinal Data Systems that 
link unit-level K-12, postsecondary, and workforce data, 
as well as tools and dashboards to explore these troves of 
data.6 Information from these systems connecting students’ 
college experiences to their labor market outcomes would 
be invaluable in informing job-seeking students, as well as 
researchers and policymakers, about which courses, majors, 
and other experiences can lead to certain occupations or 
industries. Meanwhile, the economic aspects of postsecondary 
education have well-documented effects on student outcomes,7 
but despite the abundance of data displayed in online net 
price calculators25 and college search websites for prospective 
students,26,27 students still suffer from a lack of clear and 
targeted information that would enable them to navigate 
this complex financial landscape.9 These comprehensive 
data on costs, tuition, and financial aid could be made more 
interpretable and personalized, and could also be integrated 
into institutional systems to better inform students of the 
personal financial implications of their degree plans. 


Table 2 below displays the postsecondary data topics, data 
sources, and student success issues that can be addressed by 
data sources across and outside of postsecondary institutions. 
Institution-level data provide information about a given 
postsecondary institution and its performance relative to other 
institutions. Data on linkages to K-12 and to workforce provide 
information on students’ transitions between education levels 
and the workforce, and how postsecondary education relates 
to K-12 education and workforce needs. Governance data 
pertain to the activities of leaders and governments that affect 
postsecondary education. 


Table 1. Data within postsecondary institutions and student success issues they can address 

| Data Type | Data Sources | Student Success Issues |
|---|---|---|
| Classroom and learning data | Online learning platforms, learning management systems, instructors’ records | Student learning; curricular improvement; classroom environment and peer effects |
| Student academic records | Institutions, transcripts, National Student Clearinghouse | Courses that are barriers to completion; academic pathways; personalized advising |
| Student nonacademic records | Institutions, financial aid organizations | Student social life and integration; connection to resources for basic needs; usage and effectiveness of student services |
| Institutional unit data | Academic programs and business units within institutions | Composition of student body within units; differences in student outcomes across units |


Table 2. Data across and outside of postsecondary institutions and student success issues they can address 

| Data Type | Data Sources | Student Success Issues |
|---|---|---|
| Institution-level data | College Scorecard, Integrated Postsecondary Education Data System (IPEDS) | Differences between institutions in graduation and retention rates; differences between institutions in tuition, financial aid, and debt |
| Data on linkages to K-12 and to workforce | State Longitudinal Data Systems, employers | High school preparation for college success; student employment and earnings; credentials or experiences related to careers |
| Governance data | Governments, professional associations (e.g., State Higher Education Executive Officers Association [SHEEO]) | Budget appropriations for educational institutions; laws and policies across education systems |


Finally, higher education must continue to address the issue of 
equity, both within and between postsecondary institutions. 
Inequalities in student outcomes by race and class persist, 
as do resource differentials between institutions,6 and 
institution-level data and institutional unit data on student 
outcomes can illuminate the patterns and causes of differences 
at higher levels of analysis. Institutional Mobility Report 
Cards, for instance, use tax records linked to institution-level 
data from the Department of Education's College Scorecard 
to characterize an institution’s contribution to promoting 
intergenerational income mobility for its students.28 
Governance data, such as data on federal funding for higher 
education institutions, have received relatively little attention 
in their implications for inequality, but deserve more focus.” 
Along similar lines, work also should focus on ensuring that 
institutions with fewer resources can still implement and reap 
the benefits of robust data analytics for their students. 


Challenges 


“Big data,” analytics, and data science show immense promise 
for working toward student success in postsecondary 
education. Work should continue to develop and expand 
the application of these approaches to understanding and 
addressing both academic and nonacademic issues in students’ 
lives, as well as to alleviating inequities. However, large-scale 
data collection, linkage, and analysis also raise practical and 
ethical challenges. 


One important challenge lies in building the capacity and 
infrastructure to create data linkages across institutional 
units or organizations. Data necessary for an analysis may be 
scattered across separate entities that do not communicate 
with each other, contributing to disorganized, inconsistent, 
or incomplete data.29 Higher education institutions should 
continue to build and expand the infrastructure and 
capacity necessary to collect and analyze data, working 
within information systems and institutional cultures 
to align data collection standards, build knowledge, and 
foster the willingness to adopt new methods.30 Toward 
this end, the Gates Foundation and the Institute for Higher 
Education Policy (IHEP) have spearheaded an effort to map 
and standardize data reporting on college students’ entire 
postsecondary trajectory.31 Along similar lines, a closer 
dialogue across sectors, for example among higher education 
administrators, researchers, policymakers, and data scientists, 
could develop a shared understanding that would encourage 
progress and innovation in using data to help address the most 
pressing problems hindering student success. 


RTI Press: Research Brief 


Large-scale data collection and analysis also raise the issues 
of data privacy and ethical data collection and use. Even as 
institutions collect reams of data on their students, students 
themselves often are unaware of this fact and have not 
been given the opportunity to provide informed consent 
or opt out.32,33 It is imperative that administrators justify 
the data they collect by weighing the potential benefits to 
students against the potential harms, actively communicate 
these considerations to students, and allow students 
some real measure of control over what happens to their 
information.* The risks of large-scale student data collection 
can be substantial: as more and more data are collected on 
individuals, individuals are increasingly vulnerable to the 
negative consequences of data breaches caused by improper 
data collection, storage, and analysis practices. These effects are 
magnified when multiple data sources are linked, potentially 
making individuals easier to identify.11 In addition, biased 
algorithms can unfairly profile individuals on the basis of 
their characteristics.11 While work has called for more serious 
consideration of the legal and ethical implications of large-scale 
student data collection and linkages,11,32,33,34 more 
institutions and stakeholders need to take heed and develop 
guidelines and policies to ensure transparency, consent, and 
fairness. Above all, stakeholders in postsecondary education 
must not lose sight of ethical considerations as they explore the 
potential of data analytics for improving student success. 
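
The point that linked data sources can make individuals easier to identify can be illustrated with a simple group-size check, in the spirit of k-anonymity. This is a minimal sketch under assumed inputs: the field names in the usage note (`major`, `zip`, `birth_year`) are hypothetical, and a real privacy review would involve far more than this single measure. 

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the same
    values on the given fields.

    A result of 1 means at least one person is unique on those fields and
    may be identifiable even when names and ID numbers are removed.
    """
    if not records:
        raise ValueError("records must not be empty")
    counts = Counter(tuple(r[f] for f in quasi_identifiers) for r in records)
    return min(counts.values())
```

Running the check on `["major"]`, then `["major", "zip"]`, then `["major", "zip", "birth_year"]` shows how each additional linked field can only keep the smallest group the same size or shrink it, which is why linkage tends to increase re-identification risk. 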